20 - Execution Stream Fingerprinting for Low-cost Safety-critical System Design [ID:3682]

The following content has been provided by the University of Erlangen-Nürnberg.

Okay, there we go again. So it's great to be back here in Erlangen. When I was here

last I had just started my postdoc at the University of Virginia talking about the work

that I was doing as a PhD student. That was lifetime reliability and design for manufacturability,

system level design type stuff. Today I'm going to be talking about something completely

different. I now have some research in safety-critical system design. And fingerprinting,

I have found, is a very poor choice of a word, but now I'm stuck with it because I've done

some publishing using the word fingerprinting. I'm not talking about your fingerprints; I'm talking

about compression for the purposes of easy redundancy checking. That's what I mean when

I say fingerprinting. So we'll see what exactly we're talking about in just a second there,

but wanted to clear that up so that you didn't start my talk with some wrong idea about what

I'm going to spend the next 45 minutes talking about. Technical difficulties abound. Okay,

I'll just use my finger instead. So safety-critical systems are becoming a part of our

everyday life. Going back a couple of decades we've had computers integrated with our cars.

Now we're seeing drive-by-wire, where, when you push your foot on the brake, that's not

the only input that might possibly control the brake. And very soon we'll see steering-

by-wire and other such things. But there are safety-critical systems in far more places

too. We had Phillip's talk earlier about medical devices. So there's more and more computers

in medical devices. And what's interesting about the automotive domain and the medical system

domain is that we want low cost reliability. Obviously we want reliability because these

systems could affect whether we live or die. And embedded system security is demonstrating

that it is very easy actually to kill someone that is in a car because you can hack the

car and so on. So I'm not going to talk about security today. Obviously we want these systems

to be safe. Automotive and medical systems are different from aerospace systems because

in an aerospace system you're spending hundreds of millions of dollars, most of it not on

the computers that go into it. You can spend whatever you want on the computers in an airplane

because it's just a fraction of the cost of an airplane. But in cars if we can reduce

the cost of the electronics in the car by a dollar or two and then you ship 20 million

cars then all of a sudden you've had a very significant impact on the cost of manufacturing,

maintaining and all these other things. When we talk about reducing cost though you might

get scared because you don't want it to become less safe. So the question today facing automotive

system designers, health system designers is how can we achieve the same level of reliability

or even better reliability while shaving costs or increasing the number of features that

we have. So just as a little bit of background, when we talk about reliability in an automotive

domain or a health system domain really what we are most concerned about is transient upset.

So transient upsets are caused by either radiation from the packaging for the most part or cosmic

radiation. So high energy particles that are coming from the sun that don't happen to be

filtered by our atmosphere for instance. And what happens is you have a particle strike

some silicon atom in the silicon lattice and then you have charge ionization as a result.

So all the energy from the particle is transferred into the silicon lattice which shakes a bunch

of electrons free essentially. These electrons then are collected by a diffusion area in

one of your transistors, and what this can do is cause a current spike. You can

have a current spike that occurs in your transistor, and as a result, in some computation where

you are supposed to get a one, you instead get a zero, or vice versa. Now what's interesting

about transistor scaling which has enabled so many of these fabulous applications like

putting 100 microprocessors in your car or putting semiconductors in a Band-Aid, for instance,

like we were talking about this morning. The problem with transistor scaling is that when

we make a transistor half the size the amount of charge required to disrupt it also goes

down by half. So what we see when we move from 180 nanometer technology to 16 nanometer

technology is that the failure rate for a single bit goes up by sorry the failure rate

Presenters

Prof. Dr. Brett Meyer

Accessible via

Open access

Duration

01:05:43 Min

Recording date

2014-03-21

Uploaded on

2014-03-28 14:54:08

Language

de-DE

Prof. Brett Meyer (McGill University, Canada)

Recently, the combination of semiconductor manufacturing technology scaling and pressure to reduce semiconductor system costs and power consumption has resulted in the development of computer systems responsible for executing a mix of safety-critical and non-critical tasks. However, such systems are poorly utilized if lockstep execution forces all processor cores to execute the same task even when not executing safety-critical tasks. Execution fingerprinting has emerged as an alternative to n-modular redundancy for verifying redundant execution without requiring that all cores execute the same task or even execute redundant tasks concurrently. Fingerprinting takes a bit stream characterizing the execution of a task and compresses it into a single, fixed-width word or fingerprint.
Fingerprinting has several key advantages. First, it reduces redundancy-checking bandwidth by compressing changes to external state into a single, fixed-width word. Second, it reduces error detection latency by capturing and exposing intermediate operations on faulty data. Third, it naturally supports the design of mixed-criticality systems by making dual-, triple-, and n-modular redundancy available without requiring significant architectural changes. Fourth, while it can’t guarantee perfect error detection, error detection probabilities and latencies can be tuned to a particular application. Together, these advantages translate to improved performance for mixed-criticality systems.
In this talk, I will describe fingerprinting in safety-critical systems and explore the various trade-offs inherent in its application at the architectural level and choices related to fingerprinting subsystem design, including: (a) determining what application data to compress, as a function of error detection probability and latency, and (b) identifying a corresponding fingerprinting circuit implementation.
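
To make the idea in the abstract concrete, here is a minimal software sketch of fingerprint-based redundancy checking. It is an illustration under stated assumptions, not the speaker's implementation: the choice of CRC-32 as the compression function, the modelling of "changes to external state" as a list of (address, value) stores, and the names `fingerprint` and `run_task` are all hypothetical. The fault is injected as a single flipped bit in one stored value, which is the kind of transient upset the talk describes.

```python
import zlib


def fingerprint(stores, seed=0):
    """Compress a stream of (address, value) stores into one 32-bit word.

    CRC-32 is an assumed compression function chosen for illustration;
    the talk does not commit to a particular circuit here.
    """
    fp = seed
    for addr, value in stores:
        record = addr.to_bytes(4, "little") + value.to_bytes(4, "little")
        fp = zlib.crc32(record, fp)  # fold this store into the running CRC
    return fp


def run_task(inputs, flip_store=None):
    """Hypothetical task: a running sum that emits one store per step.

    If flip_store is an index, one bit of that stored value is flipped
    to model a transient upset on the data leaving the core.
    """
    acc = 0
    stores = []
    for i, x in enumerate(inputs):
        acc = (acc + x) & 0xFFFFFFFF
        value = acc ^ (1 << 7) if i == flip_store else acc
        stores.append((0x1000 + 4 * i, value))
    return stores


# Two fault-free executions and one execution with an injected bit flip.
golden = fingerprint(run_task(range(16)))
redundant = fingerprint(run_task(range(16)))
faulty = fingerprint(run_task(range(16), flip_store=5))

fingerprints_match = golden == redundant   # redundant runs agree
fault_detected = golden != faulty          # mismatch exposes the upset
```

The checker only ever compares one fixed-width word per task instead of the full store stream, which is where the bandwidth saving comes from. A single flipped bit is always caught by a CRC, but in general two different streams can alias to the same fingerprint, which is why the abstract says detection is not perfect and has to be tuned to the application.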
